Data cleaning and pre-processing

This section summarizes the main steps applied to pre-process the raw data.

aCDOM spectra

  • The CDOM spectra were modelled according to the information in Babin (2003).

    • acdom spectra were re-fitted using the complete data (i.e. between 350 and 500 nm) because the data in all_abs_transpose.txt started at 380 nm.
  • Average background values were calculated between 683 and 687 nm and subtracted from each spectrum.

  • Some files were in binary format, so I could not open them (e.g. C2001000.YSA).

  • Some spectra start at 300 nm while others start at 350 nm.

  • Calculated the correlation between the measured and the fitted values.

    • Fits with R² lower than 0.95 were removed from the data.
  • Absorption spectra with any negative values below 500 nm were removed.

  • Exported the complete spectra (350-700 nm): both the raw and the modelled data.
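As a rough illustration of the steps above, here is a minimal Python sketch. It is not the code used for this report. The function names, the 440 nm reference wavelength and the log-linear least-squares fit are my assumptions (see Babin (2003) for the actual fitting procedure), and R² is taken here as the squared correlation between measured and fitted values. The spectrum is synthetic.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    var_u = sum((ui - mu) ** 2 for ui in u)
    var_v = sum((vi - mv) ** 2 for vi in v)
    return cov / math.sqrt(var_u * var_v)

def subtract_background(wl, a, lo=683, hi=687):
    """Subtract the mean absorption between lo and hi nm from the whole spectrum."""
    window = [v for w, v in zip(wl, a) if lo <= w <= hi]
    offset = sum(window) / len(window)
    return [v - offset for v in a]

def has_negative_below(wl, a, limit=500):
    """True if any absorption value below `limit` nm is negative."""
    return any(v < 0 for w, v in zip(wl, a) if w < limit)

def fit_exponential(wl, a, lo=350, hi=500, ref=440):
    """Fit a(wl) = a_ref * exp(-s * (wl - ref)) between lo and hi nm.

    Assumption: ordinary least squares on log(a), which requires a > 0
    over the fitting range. Returns (a_ref, s, r2).
    """
    x = [w - ref for w in wl if lo <= w <= hi]
    measured = [v for w, v in zip(wl, a) if lo <= w <= hi]
    y = [math.log(v) for v in measured]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    a_ref, s = math.exp(my - slope * mx), -slope
    fitted = [a_ref * math.exp(-s * xi) for xi in x]
    return a_ref, s, pearson(measured, fitted) ** 2

# Synthetic spectrum (not COASTLOOC data): exponential decay plus a constant offset.
wl = list(range(350, 701))
raw = [0.5 * math.exp(-0.017 * (w - 440)) + 0.01 for w in wl]

corrected = subtract_background(wl, raw)
negative = has_negative_below(wl, corrected)
a_ref, s, r2 = fit_exponential(wl, corrected)
keep = (not negative) and r2 >= 0.95  # same thresholds as in the list above
```

A non-linear fit on the untransformed values would weight the short wavelengths more heavily than this log-linear version, so the two approaches can give slightly different slopes.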

Phytoplankton and non-algal absorption

  • Absorption spectra with any negative values below 500 nm were removed.

Irradiance

  • There were negative values in the irradiance data (Ed, Eu, Kd, Ku). I have cleaned the data by setting these negative values to NA.

This graph shows the number of negative values for Ed by wavelength.
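The counting and masking could be done along these lines. This is only a sketch: the function names and the (wavelength, value) record layout are hypothetical, the values are made up, and `None` stands in for NA.

```python
from collections import Counter

def count_negative_by_wavelength(records):
    """Count negative values per wavelength.

    records: iterable of (wavelength_nm, value) pairs; None marks a missing value.
    """
    return Counter(wl for wl, v in records if v is not None and v < 0)

def mask_negative(records):
    """Replace negative values with None (the NA equivalent in this sketch)."""
    return [(wl, None if v is not None and v < 0 else v) for wl, v in records]

# Made-up Ed values, only to exercise the two functions.
ed = [(411, 1.2), (411, -0.01), (443, 0.8), (443, -0.2), (443, -0.05), (490, 0.5)]

negatives = count_negative_by_wavelength(ed)
cleaned = mask_negative(ed)
```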

  • Example of a spectral profile with negative values.

  • Eu is in fact Eu0-, which was estimated using a two-exponential function model.

  • Ed is in fact Ed0-, calculated as 0.94 × Ed0+.

  • There are differences in wavelengths among cruises: C1 used 556 and 590 nm where the other cruises used 559 and 619 nm. I have not found any information in the report concerning channel changes across the missions.

Wavelengths (nm) by cruise:

A2   C1   C2   C3   C4   C5   C6
411  411  411  411  411  411  411
443  443  443  443  443  443  443
456  456  456  456  456  456  456
490  490  490  490  490  490  490
509  509  509  509  509  509  509
532  532  532  532  532  532  532
559  556  559  559  559  559  559
619  590  619  619  619  619  619
665  665  665  665  665  665  665
683  683  683  683  683  683  683
705  705  705  705  705  705  705
779  779  779  779  779  779  779
866  866  866  866  866  866  866

Channels that differ among cruises (nm):

A2   C1   C2   C3   C4   C5   C6
559  556  559  559  559  559  559
619  590  619  619  619  619  619
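The channels that differ can be picked out of the full table programmatically. The sketch below transcribes the table above; the helper names are mine.

```python
cruises = ["A2", "C1", "C2", "C3", "C4", "C5", "C6"]

# One tuple per channel, one value per cruise (nm), transcribed from the table.
rows = [
    (411, 411, 411, 411, 411, 411, 411),
    (443, 443, 443, 443, 443, 443, 443),
    (456, 456, 456, 456, 456, 456, 456),
    (490, 490, 490, 490, 490, 490, 490),
    (509, 509, 509, 509, 509, 509, 509),
    (532, 532, 532, 532, 532, 532, 532),
    (559, 556, 559, 559, 559, 559, 559),
    (619, 590, 619, 619, 619, 619, 619),
    (665, 665, 665, 665, 665, 665, 665),
    (683, 683, 683, 683, 683, 683, 683),
    (705, 705, 705, 705, 705, 705, 705),
    (779, 779, 779, 779, 779, 779, 779),
    (866, 866, 866, 866, 866, 866, 866),
]

def differing_rows(rows):
    """Keep only the channels whose wavelength is not identical across cruises."""
    return [row for row in rows if len(set(row)) > 1]

def odd_cruises(row, names):
    """Names of the cruises whose value differs from the most common one."""
    most_common = max(set(row), key=row.count)
    return [name for name, value in zip(names, row) if value != most_common]

differing = differing_rows(rows)
culprits = [odd_cruises(row, cruises) for row in differing]
```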

Reflectance

  • Reflectance values outside the 0-1 range were set to NA.

AC9

  • Negative values in a, c, bp, a_dissolved and c_dissolved have been set to NA.

  • a(715) was used as the baseline, which is why the values at 715 nm are always 0 (see next graph).
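To make the baseline point concrete, here is a small sketch. It assumes that the correction is a plain subtraction of a(715) from every channel. The function name and the absorption values are made up.

```python
def baseline_correct(spectrum, ref=715):
    """Subtract the value at the reference wavelength from every channel.

    spectrum: dict mapping wavelength (nm) to absorption (m^-1).
    """
    offset = spectrum[ref]
    return {wl: value - offset for wl, value in spectrum.items()}

# Made-up absorption values, only for illustration.
a_raw = {412: 0.31, 440: 0.25, 676: 0.08, 715: 0.02}
a_corrected = baseline_correct(a_raw)
```

With this kind of correction the reference channel is zero by construction, so a(715) carries no information in the corrected data.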

Other stuff

  • Extracted extra variables (DOC, AQY) from Massimo 2000.

Data sampling

Just some graphs to visualize the data. Note that the same colour palette will be used to represent the areas in all graphics.

Temporal sampling

This graph shows when the sampling was performed in the different areas. For instance, we can see that a large fraction of the measurements were made in September of 1998.

Geographical map

A total of 424 different stations were sampled during the COASTLOOC expeditions.

  • Note that there are two stations without geographical coordinates: C2001000, C2002000.

Bathymetry

I have extracted the bathymetry at each sampling location. This boxplot provides a general picture of the bathymetry per area. Data from https://download.gebco.net/.
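One simple way to do such an extraction is a nearest-cell lookup on the regular grid. The sketch below only illustrates the idea: the tiny grid and the station position are made up, reading the actual GEBCO file is not shown, and the function names are hypothetical. Elevation is negative below sea level.

```python
def nearest_index(axis, value):
    """Index of the grid coordinate closest to `value`."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))

def elevation_at(lats, lons, elevation, lat, lon):
    """Elevation (m) of the grid cell nearest to (lat, lon).

    elevation[i][j] is the value at lats[i], lons[j].
    """
    return elevation[nearest_index(lats, lat)][nearest_index(lons, lon)]

# Made-up 3 x 3 grid, only for illustration.
lats = [43.0, 43.5, 44.0]
lons = [12.0, 12.5, 13.0]
elevation = [
    [-50, -60, -70],
    [-20, -35, -45],
    [5, -10, -25],
]

station_elevation = elevation_at(lats, lons, elevation, 43.4, 12.9)
```

A positive value returned for a station would flag a position that falls on land in the grid.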

Land-to-sea gradient

We could also present the data in relation to distance from land to get an overview of the influence of the landscape on optical quantities. For instance, acdom will likely be higher at stations located close to the shore because of terrestrial influence.
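A possible way to compute the distance to land is the great-circle (haversine) distance to the nearest point of a coastline dataset. This is a sketch under assumptions: the coastline points are hypothetical, the function names are mine, and a mean Earth radius of 6371 km is used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of a spherical Earth)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def distance_to_land_km(lat, lon, coast_points):
    """Distance from a station to the closest coastline point."""
    return min(haversine_km(lat, lon, clat, clon) for clat, clon in coast_points)

# Hypothetical coastline points (lat, lon), only for illustration.
coast = [(0.0, 1.0), (0.0, 5.0)]
d = distance_to_land_km(0.0, 0.0, coast)
```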

Available variables

This graph shows an overview of the available variables (excluding radiometric measurements).

Absorption measurements

Overview of the averaged absorption spectra for each area.

Comparing acdom443 among the areas shows a clear gradient from open to coastal waters.

We can see that the DOC follows the same pattern as acdom443.

We can also use scatter plots to further explore the relationships among variables.

Relationships between some pigments.

aphy

We could also assess the goodness of the relationships between total chlorophyll-a and phytoplankton absorption for each region.

anap

ap

acdom

Absorption partition

In this section I am using the same three stations as in Oubelkheir et al. (2007) to explore the additive contributions of each type of absorption.

  • For station C6024000, a_p is lower than a_cdom around 400 nm and 550 nm. Should we use this to filter out problematic spectra?

  • There are some obvious problems with a_cdom measurements. See C6024000 where there is a bump in absorption around 550 nm.

  • This is a ternary plot showing the relative contribution of \(a_{\phi}(443)\), \(a_{\text{NAP}}(443)\) and \(a_{\text{CDOM}}(443)\). I think there are interesting patterns in this graph.
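The coordinates of a point in such a ternary plot are the three absorption terms divided by their sum. A minimal sketch, with a hypothetical function name and made-up values:

```python
def absorption_fractions(a_phi, a_nap, a_cdom):
    """Relative contributions of the three components at one wavelength.

    Returns (f_phi, f_nap, f_cdom), which sum to 1.
    """
    total = a_phi + a_nap + a_cdom
    if total <= 0:
        raise ValueError("total absorption must be positive")
    return a_phi / total, a_nap / total, a_cdom / total

# Made-up values at 443 nm (m^-1), only for illustration.
f_phi, f_nap, f_cdom = absorption_fractions(0.02, 0.01, 0.07)
```

Stations with a negative or missing component would need to be excluded before computing the fractions.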

Spectral slopes

This graph compares the spectral slopes of both CDOM and NAP absorption spectra.

Irradiance

Ed

Eu

Kd

Ku

Reflectance

AC9

Absorption

Beam attenuation

Scattering

Orientation of the paper

  • The data is a mix of temporal and spatial observations, so how should we present the data?

    • By area?

Journal candidates

Figures for the paper

This section shows the figures that I think should be included in the data paper.

Figure 1

Figure 1: Map of the sampling stations.

Figure 2

Figure 2: (A) Overview of the temporal sampling for the seven areas. The numbers in the circles indicate the number of visited stations each month. (B) Boxplot showing the bathymetry at the sampling locations by area.

Figure 3

Figure 3: (A) Total chlorophyll-a and (B) particulate organic carbon across the sampled areas.

  • Is it normal that there are no data for Med. Sea (Case 1)?

Figure 4

Figure 4: (A) Average total particulate (\(a_\text{p}\)), (B) non-algal (\(a_\text{NAP}\)), (C) phytoplankton (\(a_{\phi}\)) and (D) chromophoric dissolved organic matter (\(a_\text{CDOM}\)) absorption spectra in each area. (E) \(a_\text{CDOM}(350)\) along the westernmost transect in the North Sea (see Fig. 1B).

  • Is it normal that there are no data for Med. Sea (Case 1)?

Figure 5

Figure 5: (A) Particulate scattering coefficient at 440 nm (\(b_{p}(440)\)) and (B) attenuation coefficient for downward irradiance at 443 nm (\(K_{d}(443)\)) across the sampled areas.

Figure 6

Figure 6: Scatterplots showing relationships among selected variables. (A) Particulate organic carbon (POC) and (B) phytoplankton absorption at 443 nm (\(a_{\phi}(443)\)) against total chlorophyll-a. (C) Downward irradiance at 443 nm (\(E_{d}(443)\)) and (D) particulate scattering at 440 nm (\(b_{p}(440)\)) against particulate organic carbon.

Done

  • Calculate s_nap and s_cdom. See the method in Babin (2003) where he removes some wavelengths to calculate s_nap.

  • Removed dissolved a and c from the AC9 data because there were problems with the filtering procedure during the sampling.

  • Extract bathymetry at each station.

  • Zoom in on geographic areas in Fig. 1, as in Figure 13 of the final report.

  • Wait for Frank to correct the bug with the mrg file where the data columns are not aligned correctly.

  • Recode Ed wavelengths from the SPMR vertical profiles as:

    • 412 -> 411
    • 510 -> 509
    • 589 -> 590
    • 666 -> 665
    • 780 -> 779
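The recoding above amounts to a lookup table. A minimal sketch, with hypothetical names:

```python
# Mapping taken from the list above: SPMR profile wavelength -> recoded wavelength (nm).
RECODE = {412: 411, 510: 509, 589: 590, 666: 665, 780: 779}

def recode_wavelength(wl):
    """Return the recoded wavelength, leaving other wavelengths unchanged."""
    return RECODE.get(wl, wl)

recoded = [recode_wavelength(wl) for wl in [412, 443, 510, 589, 666, 780]]
```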

Todos

  • No absorption for Med. Sea (Case 1). Is it normal?

  • There are a lot of nutrient parameters that have values of zero. Are they true zeros or do they indicate missing values?

  • There are wavelength gaps in the AC9, irradiance and reflectance data. Is that normal?

  • Add units to each variable. For example, depth should become depth_m.

  • Ternary plot to characterize the contribution of each optically active substance to total absorption.

  • I do not have the backscattering data from the BB-4.

  • Some geographical positions are located on land (Adriatic Sea for example).

  • Calculate the apparent visible wavelength index (AVW) and see if it can be exploited in this paper.

  • There are duplicated Ed spectra in the data.

  • DOC vs aCDOM.

  • Same point size for outliers and observations in boxplots.

These are the duplicated Ed stations.

This is the same for Eu.
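Duplicates like these can be found by grouping stations on their spectral values. The sketch below assumes exact equality is the right criterion (real data may need rounding first). The station names and values are made up.

```python
from collections import defaultdict

def find_duplicated_spectra(spectra):
    """Group stations that share exactly the same spectrum.

    spectra: dict mapping station id to a sequence of values.
    Returns a list of station-id lists, one per duplicated spectrum.
    """
    groups = defaultdict(list)
    for station, values in spectra.items():
        groups[tuple(values)].append(station)
    return [stations for stations in groups.values() if len(stations) > 1]

# Made-up stations and values, only for illustration.
ed_spectra = {
    "S1": [1.20, 0.80, 0.50],
    "S2": [1.10, 0.70, 0.40],
    "S3": [1.20, 0.80, 0.50],
}
duplicates = find_duplicated_spectra(ed_spectra)
```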

  • There are only two AC9 measurements in the Adriatic Sea. Is it normal?

References

Babin, Marcel. 2003. “Variations in the Light Absorption Coefficients of Phytoplankton, Nonalgal Particles, and Dissolved Organic Matter in Coastal Waters Around Europe.” Journal of Geophysical Research 108 (C7): 3211. https://doi.org/10.1029/2001JC000882.
Oubelkheir, Kadija, Hervé Claustre, Annick Bricaud, and Marcel Babin. 2007. “Partitioning Total Spectral Absorption in Phytoplankton and Colored Detrital Material Contributions.” Limnology and Oceanography: Methods 5 (11): 384–95. https://doi.org/10.4319/lom.2007.5.384.